Towards an Italian Lexicon for Polarity Classification (polarITA): a Comparative Analysis of Lexical Resources for Sentiment Analysis
Abstract
The paper describes a preliminary study for the development of a novel lexicon for Italian sentiment analysis, i.e., a lexicon in which words are associated with polarity values. Given the influence of sentiment lexica on the performance of sentiment analysis systems, a methodology based on the detection and classification of errors in the currently available lexical resources is proposed, together with an extrinsic evaluation of the impact of such errors on those systems. The final aim is to build a novel resource by filtering the existing lexical resources and integrating them with missing lexical entries, yielding more reliable associations between entries and polarity.
Similar resources
A Survey of Sentiment Classification Techniques Used for Indian Regional Languages
Sentiment Analysis is a natural language processing task that extracts sentiment from various text forms and classifies them according to positive, negative or neutral polarity. It analyzes emotions, feelings, and the attitude of a speaker or a writer towards a context. This paper gives a comparative study of various sentiment classification techniques and also discusses in detail two main catego...
Adapting a Polarity Lexicon using Integer Linear Programming for Domain-Specific Sentiment Classification
Polarity lexicons have been a valuable resource for sentiment analysis and opinion mining. There are a number of such lexical resources available, but it is often suboptimal to use them as is, because general purpose lexical resources do not reflect domain-specific lexical usage. In this paper, we propose a novel method based on integer linear programming that can adapt an existing lexicon into...
Acquiring an Italian Polarity Lexicon through Distributional Methods
Recent interest in Sentiment Analysis has drawn attention to effective methods to detect opinions and sentiments in texts. Many approaches in the literature are based on resources, such as Polarity Lexicons, which model the prior polarity of words or multi-word expressions. Developing such resources is expensive, language dependent, and linguistic sentiment phenomena are not fully covered in the...
A Language Independent Method for Generating Large Scale Polarity Lexicons
Sentiment Analysis systems aim at detecting opinions and sentiments that are expressed in texts. Many approaches in the literature are based on resources that model the prior polarity of words or multi-word expressions, i.e. a polarity lexicon. Such resources are defined by teams of annotators, i.e. a manual annotation is provided to associate emotional or sentiment facets to the lexicon entries. ...
Sentiment analysis on Italian tweets
We describe TWITA, the first corpus of Italian tweets, which is created via a completely automatic procedure, portable to any other language. We experiment with sentiment analysis on two datasets from TWITA: a generic collection and a topic-specific collection. The only resource we use is a polarity lexicon, which we obtain by automatically matching three existing resources thereby creating the...
Publication date: 2017